Domain-specific Word Prediction for Augmentative Communication
Abstract
Many augmentative communication systems employ word prediction to help minimize the number of user actions needed to construct messages. Statistical prediction techniques rely upon a database (model) of word frequencies and inter-word correlations derived from a large text corpus. One potential means to improve prediction is to create a set of models derived from domain-specific corpora, dynamically switching to the model most appropriate for the current conversation. Using telephone transcripts to generate prediction models for 20 different topic domains, we have observed a clear benefit to including domain-specific models in an overall prediction scheme.

BACKGROUND

Statistical word prediction systems for augmentative communication commonly utilize both word frequencies and inter-word correlations (word contexts). An ngram prediction model utilizes the past n-1 words to predict the nth (current) word. Easily derived from large samples of text, ngram models can provide impressive prediction performance: Lesher (1) reports on a trigram (n=3) model derived from a 3 million word corpus that yielded keystroke savings in excess of 54%.

There have been numerous techniques suggested for enhancing traditional ngram word prediction, including recency, syntactic analysis, and syntax-based ngrams. One technique that has not been fully explored is the use of domain-specific ngram models, that is, models derived from text samples that are focused on distinct subjects or genres. In theory, these ngram models could be dynamically swapped in and out of use to match the direction of an ongoing conversation.

The text used to train a word prediction system should match as closely as possible the kind of messages produced by the augmented communicator. Although core vocabulary stays fairly constant (2), fringe vocabulary may change substantially through the course of a day as different topics and settings are encountered. The same is likely to hold true for inter-word correlations.
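As an illustration of the trigram prediction described in this section, the following minimal Python sketch counts which words follow each pair of preceding words and returns a ranked prediction list. The function names, the toy sentence, and the absence of any backoff to shorter contexts are assumptions made for illustration only; this is not the authors' implementation.

```python
from collections import Counter, defaultdict


def train_trigrams(tokens):
    """Count how often each word follows each pair of preceding words."""
    counts = defaultdict(Counter)
    for w1, w2, w3 in zip(tokens, tokens[1:], tokens[2:]):
        counts[(w1, w2)][w3] += 1
    return counts


def predict(counts, w1, w2, prefix="", k=6):
    """Return up to k words most often seen after (w1, w2) that start with prefix.

    Assumption: no backoff to bigram or unigram counts, so an unseen
    context simply yields an empty list.
    """
    candidates = counts.get((w1, w2), Counter())
    ranked = [w for w, _ in candidates.most_common() if w.startswith(prefix)]
    return ranked[:k]


# Toy training text (hypothetical, not taken from any corpus).
tokens = "i like to cook i like to read i like to cook pasta".split()
model = train_trigrams(tokens)

print(predict(model, "like", "to"))       # ['cook', 'read']
print(predict(model, "like", "to", "r"))  # ['read']
```

A practical system would likely fall back to shorter contexts when a word pair has not been seen in training; that step is omitted here to keep the sketch short.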
We know of no studies that have attempted to quantify the effect of domain shifts on word prediction efficacy. As a precursor to developing a system that can automatically shift between appropriate domain-specific models, we undertook to find the keystroke savings possible in such a system. Utilizing transcripts from the Switchboard Corpus, a series of 2,400 telephone conversations organized into approximately 60 topic domains (for example, recycling, food/cooking), we have quantified the performance gains associated with utilizing domain-specific ngram models. Although this corpus does not involve augmented communicators, it is conversational, large, and organized into specific topical domains, making it by far the most suitable large corpus currently available.

RESEARCH QUESTION

The question we addressed is: Can the use of domain-specific ngram models appreciably enhance word prediction performance in the context of augmentative communication? While our early studies indicated that database domain specificity did not play a significant role in system performance, recent pilot studies indicated that this question merited a more focused investigation.

METHODS

We chose to study the 20 most frequently occurring topic domains in the Switchboard Corpus. The testing text for each of these 20 target domains was generated by concatenating all conversations of that domain from the first 12.5% of the corpus. The remainder of the corpus was used to generate the domain-specific training texts. We generated two other training texts: 1) ‘Small’, consisting of 5% of the training text for each of the 20 target domains, resulting in a text approximately the same size as the average of the 20 domain-specific training texts; and 2) ‘Big’, comprising the entire training text. Trigram models were created for each of the 22 training texts. The experiments were carried out using our IMPACT augmentative communication software.
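The construction of the testing and training texts described in the Methods can be sketched roughly as follows. The input format (a list of domain and text pairs in corpus order), the function name, the choice of the first conversations of each domain for ‘Small’, and the measurement of the 5% by conversation count rather than by text size are all assumptions made for illustration; the source does not specify these details.

```python
from collections import Counter, defaultdict


def build_texts(conversations, n_domains=20, test_frac=0.125, small_frac=0.05):
    """Split a corpus into per-domain test/training texts plus 'Small' and 'Big'.

    conversations: list of (domain, text) pairs in corpus order
    (hypothetical format, not the Switchboard file layout).
    """
    cut = int(len(conversations) * test_frac)
    held_out, rest = conversations[:cut], conversations[cut:]

    # Target domains: the most frequently occurring ones in the corpus.
    freq = Counter(domain for domain, _ in conversations)
    targets = [d for d, _ in freq.most_common(n_domains)]

    test = defaultdict(list)
    for domain, text in held_out:
        if domain in targets:
            test[domain].append(text)

    train = defaultdict(list)
    for domain, text in rest:
        if domain in targets:
            train[domain].append(text)

    # 'Small': a slice of each target domain's training conversations.
    # Assumption: the first conversations are taken, counted per conversation.
    small = []
    for d in targets:
        n = max(1, round(len(train[d]) * small_frac))
        small.extend(train[d][:n])

    # 'Big': assumed here to be every conversation outside the held-out part.
    big = [text for _, text in rest]
    return dict(test), dict(train), small, big


# Synthetic corpus of 80 conversations alternating between two domains.
corpus = [(("cooking", "recycling")[i % 2], f"conversation {i}") for i in range(80)]
test, train, small, big = build_texts(corpus, n_domains=2)
print(len(test["cooking"]), len(train["recycling"]), len(small), len(big))  # 5 35 4 70
```

Each list of conversations would then be concatenated and passed to a trigram trainer, giving one model per target domain plus one each for ‘Small’ and ‘Big’.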
Running in emulation mode, IMPACT can simulate a human using its interface to produce a message. The testing interface consisted of a standard QWERTY keyboard augmented by a dynamic 6-word prediction list. Keystroke savings (KS) were used as the performance measure.

RESULTS

We measured the performance of four prediction model configurations on each of the 20 domain-specific testing texts. The four configurations were: 1) ‘Small’ only; 2) ‘Auto’, meaning that the prediction model was derived from the same domain as the testing text; 3) ‘Big’ only; and 4) ‘Big+Auto’, a blending of two ngram models. Figure 1 shows performance on 10 representative domains. Table 1 shows average performance over all 20 domains for the four configurations.

Not surprisingly, the ‘Big+Auto’ configuration, with its equally weighted general and specific components, yielded the best results, followed in turn by ‘Big’, ‘Auto’, and ‘Small’. The almost 2% advantage of ‘Big’ over ‘Auto’ is also reasonable given its much larger training text size. However, this effect is also due to the fact that the conversants were generally not experts in these domains. This boosts the relative importance of those testing text statistics correlated with conversation in general, and the ‘Big’ training text constitutes a fairly large and therefore reliable sample of such general conversational statistics.

The ‘Small’ training text is only about 1/40 the size of the ‘Big’ training text, yet it covers a significant fraction of the domains that ‘Big’ does, thus rendering it a far less reliable sample of general conversational statistics. This accounts for our most interesting result, which is the nearly 3% advantage of the ‘Auto’ configuration over the comparably sized ‘Small’ configuration. Thus, for a given model size, using a model derived from text of the same domain as the testing text yields better prediction than using a model derived from a more general pool of text.
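The emulated measurement and the equally weighted ‘Big+Auto’ blend can be sketched together as below. This is a rough illustration under stated assumptions, not the IMPACT software: keystroke savings are taken to be the percentage of keystrokes saved relative to typing every letter and space, selecting a word from the list is assumed to cost one keystroke and to supply the trailing space, and the blend is assumed to average each model's relative frequencies. All function names and the toy texts are hypothetical.

```python
from collections import Counter, defaultdict


def train(tokens):
    """Trigram counts: (w1, w2) -> Counter of following words."""
    counts = defaultdict(Counter)
    for w1, w2, w3 in zip(tokens, tokens[1:], tokens[2:]):
        counts[(w1, w2)][w3] += 1
    return counts


def blended_scores(models, context):
    """Average the models' relative frequencies for this context, equal weights."""
    scores = Counter()
    for model in models:
        following = model.get(context, {})
        total = sum(following.values())
        for word, count in following.items():
            scores[word] += count / total / len(models)
    return scores


def keystroke_savings(models, tokens, list_size=6):
    """Percent of keystrokes saved versus typing every letter and space."""
    baseline = used = 0
    history = ["<s>", "<s>"]  # padding so the first words have a context
    for word in tokens:
        baseline += len(word) + 1  # letters plus trailing space
        scores = blended_scores(models, tuple(history[-2:]))
        cost = len(word) + 1       # worst case: type the whole word
        for typed in range(len(word)):
            matches = [w for w, _ in scores.most_common()
                       if w.startswith(word[:typed])]
            if word in matches[:list_size]:
                cost = typed + 1   # letters typed so far plus one selection
                break
        used += cost
        history.append(word)
    return 100.0 * (baseline - used) / baseline


# Toy general and domain-specific training texts (hypothetical).
general = train("<s> <s> i like to read books".split())
domain = train("<s> <s> i like to cook pasta".split())
test_tokens = "i like to cook".split()

ks_general = keystroke_savings([general], test_tokens)
ks_domain = keystroke_savings([domain], test_tokens)
ks_blend = keystroke_savings([general, domain], test_tokens)
print(round(ks_general, 1), round(ks_domain, 1), round(ks_blend, 1))  # 46.7 73.3 73.3
```

The toy numbers carry no empirical weight; they only show how a word missing from the general model ("cook") costs full keystrokes until a domain-specific component supplies it.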
As noted earlier, this advantage arises because the ‘Auto’ models provide a better match between the word usage patterns of the training and testing texts. Finally, we emphasize that the ‘Big+Auto’ configuration exceeds the ‘Big’ configuration by nearly 1%, despite the fact that it is only very slightly larger than the ‘Big’ configuration. This reinforces our main finding of the benefit of using domain-specific prediction models and suggests that, as total model storage capacity grows, the greatest incremental improvements will come from adding domain-specific training text rather than general text. This is the subject of ongoing studies.

DISCUSSION

Since it appears that domain-specific databases can provide substantial improvements in word prediction, where can appropriate databases be found? Existing corpora such as the Brown, Switchboard, and British National Corpora consist of text categorized roughly along various domain boundaries (topic, genre, sophistication, etc.). By dividing these corpora along these categories, a series of baseline domain-specific models could be derived.

Our research team is also investigating the feasibility of culling appropriate databases from the internet using an autonomous “web crawler” (3). While the web offers a wealth of text (perhaps as much as a trillion words), this text varies widely in content, style, and sophistication. We have developed a prototype web crawler capable of searching out and retrieving specific genres of text. Such a system opens up exciting new possibilities for domain-specific word prediction since it can potentially produce very large databases, and database size is an important determiner of word prediction accuracy (1).

This paper has focused on prediction databases specific to a particular topic domain. The model can clearly be extended to other domain classification schemes such as style, formality, or genre.
For example, at different points during a day, a student might be working on an essay for class, a work of fiction, and a letter to a friend. By switching

Figure 1: Comparison of Predictive Models (x-axis: Domain, 1 to 10; y-axis: Keystroke Savings, 49% to 61%; legend entries recovered: Big+Auto, Big)
Publication date: 2001